ERROR PATTERNS FROM ALTERNATIVE
COST PROGRESS MODELS 

ABSTRACT 

Numerous cost progress models have been offered in the 
literature and used in practice. This paper selects five cost 
progress models which predict future cost using various 
combinations of three factors (past cost, cumulative quantity, and 
production rate), and investigates the forecast accuracy of the 
models under varying circumstances. The broad objectives are to 
(1) identify conditions which may affect model accuracy, 
documenting the manner in which forecast errors for each model 
depend on those conditions, and (2) suggest which of the five 
models may be more or less accurate under a given set of 
conditions. Particular attention is paid to how model accuracy is 
affected by one specific condition — changes in production rate. 






INTRODUCTION 

Cost progress models have proven their value in estimating 
tasks encountered in production, purchasing and the management of 
other organizational operations. Going by various names (e.g. 
"experience curves", "learning curves", "cost improvement curves"), 
cost progress models have long been accepted as a useful tool for 
planning, estimating, and predicting the pattern of costs expected 
from a repetitive production or acquisition process. Various cost 
progress models exist but most such models are versions of the 
standard learning curve, perhaps with additional variables added to 
improve explanatory power and forecast accuracy. 

How accurate are various cost progress models? Does their 
accuracy depend on the conditions surrounding their use? Are 
particular cost progress models more accurate in some circumstances 
and other models more accurate in other circumstances? The purpose 
of this paper is to document the accuracy of a set of common cost 
progress models under various circumstances, indicating variables 
that may impact model accuracy, and highlighting situations when 
model accuracy may be expected to improve or deteriorate.

RELATED RESEARCH 

The literature on cost progress models/learning curves is 
substantial.¹ Three branches of research are relevant to the
current study. The first branch has to do with alternative forms
of cost progress models and alternative variables suggested for
inclusion. Most cost progress models start (some end) with some
version of the familiar learning curve. The premise of the 
learning curve is that cumulative quantity is the primary cause of 
changes in unit cost during a production or acquisition program. 
There is general acknowledgement that cumulative quantity is only 
a partial explanation and hence much prior research has attempted 
to augment learning models with other variables. Some attention 
has been paid to variables reflecting changes in fixed costs 
associated with capacity (e.g., Balut, 1981; Balut et al., 1989;
Moses, 1990), but the greatest amount of attention has been paid to
changes in production rate.²

Conceptually, production rate is argued to affect unit cost due
to economies (or diseconomies) of scale (e.g., Bemis, 1981; Boger
and Liao, 1990; Large et al., 1974; Linder and Willbourn, 1973).
Empirically, evidence on the benefit of including production rate 
variables in cost progress models is mixed. Various studies (e.g., 
Alchian, 1963; Cochran, 1960; Hirsh, 1952; Large, Campbell and 
Cates, 1976) found little or no significance for rate variables. 
Other studies did document significant rate/cost relationships 
(e.g., Bemis, 1981; Cox and Gansler, 1990). In reviewing the 
existing research on production rate, Smith (1980) concluded that
a rate/cost relationship may exist but that the existence, strength
and nature of the relationship vary with the item produced and
the cost element examined.³ Collectively, this branch of
literature suggests that inclusion of variables, such as production
rate, in cost progress models sometimes has improved cost
explanation, but not always. It is relevant here because the
present research selects a representative number of cost progress 
models from the existing literature and investigates their accuracy 
under various conditions. 

The second branch of literature has been concerned with 
identifying factors that cause or influence the nature of the 
learning or cost improvement phenomenon, with attention paid to a 
wide variety of behavioral, organizational and process variables. 
Conway and Schultz's (1959) classic paper is an early example. 
Dutton and Thomas (1984) provide a typology of factors causing 
learning, dividing these factors into categories based on origin 
and type. Adler and Clark (1991) provide a step toward modeling 
the links between selected causal factors and resultant learning. 
This branch of literature is relevant to the current paper because 
it documents how cost improvement patterns are inevitably 
influenced by a host of variables. It implicitly acknowledges 
that the ability of cost progress models to adequately describe 
cost/output relationships will depend on these factors. In short, 
this literature implies that model forecast accuracy (irrespective 
of the form of model selected) will be conditional on
circumstances.

The third branch of literature is concerned with explicitly 
examining cost progress model accuracy under various conditions. 
Smunt (1986) compared learning curve models to naive and moving
average models, finding that relative accuracy depended on such
factors as learning rate and forecast horizon. Moses (1991, 1992)
examined learning curve and rate adjustment models, concluding that
relative forecast accuracy and bias were dependent on a collection 
of variables, including variations in production rate, in factory 
burden, in data availability, as well as other factors. These 
studies are relevant here because they explicitly identify 
situations where cost progress models can be expected to be 
comparatively more or less accurate, one question of interest in 
the present study. Some of the conditions examined in these 
studies, conditions expected to influence cost progress model 
accuracy, are re-examined here. However, each of these prior 
studies observed accuracy using simulated data under well-controlled
experimental conditions. Their results should perhaps
be seen as hypotheses about how model accuracy may behave in 
practice with actual data. Observing the accuracy of various cost 
progress models, under various conditions, when applied to data 
from actual programs is the objective of this study. 

ALTERNATIVE COST PROGRESS MODELS 

Consider the central purpose of a cost progress model. It is 
not really a model that explains cost per se. (It says nothing 
about the absolute amount of cost.) Rather its purpose is to 
explain the relationship between costs at different points during 
a repetitive production/acquisition process. Every cost progress 
model rests on two assumptions: (1) that future cost depends on 
past cost, and (2) that future cost differs systematically from 
past cost as a function of changing conditions during the 
repetitive process. Alternative models differ primarily in which
"changing conditions" the modeler sees as sufficiently important to
be included in the model. The most common cost progress model is
the learning curve, which assumes that future cost systematically 
differs from past cost as a function of "experience", measured by 
cumulative output. The most common modification of the learning 
curve is, as mentioned previously, the incorporation of a term to 
reflect production rate, which assumes additionally that future 
cost systematically differs from past cost as a function of output 
per period. 

This study investigates the accuracy of cost progress models 
that include the three variables just mentioned: (1) past cost, (2) 
cumulative quantity, and (3) production rate. Selectively 
combining these variables, four possibilities exist: 

a) Future cost = f(past cost)

b) Future cost = f(past cost, cumulative quantity)

c) Future cost = f(past cost, production rate)

d) Future cost = f(past cost, cumulative quantity, production rate)

One model each from groups a, b, and c, and two models from group 
d, are investigated. 

1. Random Walk (RW) Model: The simplest of all, the random
walk model assumes that future cost is equal to the most recent
past cost:

$C_t = C_{t-1}$   (1)

where

C = unit cost
t = sequencing subscript

This naive model serves as a benchmark for assessing the accuracy
gained by including additional variables.

2. Learning Curve (LC) Model: The familiar learning curve⁴
is the model used for incorporating "experience" into the
prediction.

$C_t = C_1 Q_t^b$   (2)

where

C_1 = theoretical first unit cost
Q = cumulative quantity produced
b = a parameter, the learning curve exponent or slope
C, t = as before
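One way equation (2) might be estimated and used is sketched below. The sketch assumes ordinary least squares on the log-linear form log C = log C_1 + b log Q; the estimation procedure actually used in the study is not restated here, and how Q is measured for a lot (for example, at a lot midpoint) is left to the caller. The function names and the data are hypothetical.

```python
import math


def fit_learning_curve(cum_qty, unit_cost):
    """Estimate C1 and b in C = C1 * Q**b by least squares on logs (assumed method)."""
    x = [math.log(q) for q in cum_qty]
    y = [math.log(c) for c in unit_cost]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                       # learning curve exponent
    c1 = math.exp(y_bar - b * x_bar)    # theoretical first unit cost
    return c1, b


def forecast_learning_curve(c1, b, next_cum_qty):
    """Equation (2) evaluated at the next cumulative quantity."""
    return c1 * next_cum_qty ** b


# Hypothetical series generated from an exact 80% curve with C1 = 100.
b_true = math.log(0.8, 2)
cum_qty = [10, 30, 60, 100]
unit_cost = [100 * q ** b_true for q in cum_qty]

c1_hat, b_hat = fit_learning_curve(cum_qty, unit_cost)
next_cost = forecast_learning_curve(c1_hat, b_hat, 150)
```

Because the hypothetical data contain no noise, the fit recovers the generating parameters; with real lot data the estimates would of course be less exact.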

3. Rate Adjustment (RA) Model: The assumption of the rate
adjustment model is that future cost is equal to past cost,
adjusted for any change in production rate (production volume per
period).

$C_t = C_{t-1} A_t$   (3)

$A_t$ is an adjustment factor capturing the impact of production rate
on the spreading of fixed costs.

$A_t = \frac{R_{t-1}}{R_t} F + (1 - F)$   (3a)

where

A = adjustment factor
R = production rate per period
F = proportion of cost represented by fixed overhead⁵
C, t = as before

Unit cost is assumed to vary inversely with production rate due to 
the spreading of fixed overhead cost over differing volume. Thus 
unit cost will change as production rate (R) changes, and the
degree of change will depend on the proportion of fixed overhead
cost in total cost (F). The adjustment factor is a version of an
"overhead redistribution" model developed by Balut (1981).⁶

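As a rough illustration of the rate adjustment model, the sketch below assumes the adjustment factor takes the form A_t = (R_{t-1}/R_t) F + (1 - F), so that only the fixed share of cost is rescaled by the ratio of the old to the new production rate. The function names and input values are hypothetical.

```python
def rate_adjustment_factor(prev_rate, next_rate, fixed_share):
    """A_t = (R_{t-1} / R_t) * F + (1 - F), the assumed form of equation (3a)."""
    return (prev_rate / next_rate) * fixed_share + (1.0 - fixed_share)


def forecast_rate_adjustment(prev_cost, prev_rate, next_rate, fixed_share):
    """Equation (3): most recent unit cost scaled by the adjustment factor."""
    return prev_cost * rate_adjustment_factor(prev_rate, next_rate, fixed_share)


# Hypothetical case: unit cost of 10.0 last period, rate doubles from 50 to 100,
# and 40% of unit cost is fixed overhead. The factor is 0.5 * 0.4 + 0.6 = 0.8.
doubled = forecast_rate_adjustment(10.0, 50, 100, 0.4)

# With an unchanged rate the factor is 1 and the forecast equals the random walk.
unchanged = forecast_rate_adjustment(10.0, 50, 50, 0.4)
```

Under these assumptions a doubling of the rate lowers the forecast by 20%, which is half of the 40% fixed share.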
4. Bemis (BE) Learning/Rate Model: This is the first model
presented here which considers (1) past cost, (2) cumulative
quantity and (3) production rate. It is the most widely used
model incorporating these three variables and was developed by
augmenting the traditional learning curve with an analogous
production rate term.

$C_t = C_1 Q_t^b R_t^d$   (4)

where

d = a parameter, the production rate exponent or slope
C, C_1, Q, R, b, t = as before

Work on production rate dates at least to the 1950s (e.g., Hirsh, 
1952) and empirical work on this learning/rate model was first 
conducted by RAND (e.g., Large et al., 1974), but Bemis (1981)
has been credited with popularizing the model (the reason the Bemis 
label is used here). 
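Equation (4) is linear in logarithms, so one plausible way to estimate it is a two-regressor least-squares fit of log C on log Q and log R. The sketch below assumes that approach and solves the normal equations directly; it is an illustration with hypothetical names and data, not the estimation procedure documented for the study.

```python
import math


def solve_linear_system(a, y):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_learning_rate_model(cum_qty, rate, unit_cost):
    """Fit log C = log C1 + b log Q + d log R via the normal equations."""
    rows = [[1.0, math.log(q), math.log(r)] for q, r in zip(cum_qty, rate)]
    y = [math.log(c) for c in unit_cost]
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(3)]
    log_c1, b, d = solve_linear_system(xtx, xty)
    return math.exp(log_c1), b, d


def forecast_learning_rate_model(c1, b, d, next_cum_qty, next_rate):
    """Equation (4) evaluated at next-period quantity and rate."""
    return c1 * next_cum_qty ** b * next_rate ** d


# Hypothetical lots generated exactly from C1 = 100, b = -0.3, d = -0.1.
rate = [10, 20, 15, 30, 25]
cum_qty = [10, 30, 45, 75, 100]
unit_cost = [100 * q ** -0.3 * r ** -0.1 for q, r in zip(cum_qty, rate)]

c1_hat, b_hat, d_hat = fit_learning_rate_model(cum_qty, rate, unit_cost)
```

With only a handful of lots, three parameters leave few degrees of freedom, which is the estimation concern raised later in the paper.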

5. Balut (BA) Learning/Rate Model: This is a second model
which considers past cost, cumulative quantity and production rate. 
It is a version of the original Balut (1981) model and combines the 
traditional learning curve (Model 2) and the rate adjustment model 
(Model 3) previously discussed. The basic premise is that, in the 
absence of production rate changes, cost would follow a traditional 
learning curve. The impact of production rate change is
incorporated by adjusting the cost forecasts from the learning
curve model by an overhead redistribution adjustment factor.

$C_t = C_1 Q_t^b A_{at}$   (5)

where

$A_{at} = \frac{R_a}{R_t} F + (1 - F)$   (5a)

and

R_a = reference production rate, average production rate for
past lots
C, C_1, Q, b, t, R, F = as before

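A minimal sketch of the forecast in equation (5) follows. It assumes the learning curve parameters C_1 and b have already been estimated, that the fixed-overhead share F is supplied by the analyst, and that the adjustment factor has the form A_at = (R_a/R_t) F + (1 - F). All names and values are hypothetical.

```python
import math


def balut_forecast(c1, b, next_cum_qty, past_rates, next_rate, fixed_share):
    """C_t = C1 * Q_t**b * A_at, with A_at = (R_a / R_t) * F + (1 - F)."""
    reference_rate = sum(past_rates) / len(past_rates)  # R_a: average past rate
    adjustment = (reference_rate / next_rate) * fixed_share + (1.0 - fixed_share)
    return c1 * next_cum_qty ** b * adjustment


# Hypothetical inputs: an 80% learning curve with C1 = 100, past rates
# averaging 20 per period, and half of unit cost treated as fixed overhead.
b = math.log(0.8, 2)
at_higher_rate = balut_forecast(100.0, b, 4, [10, 20, 30], 40, 0.5)

# When the next rate equals the reference rate, the forecast reduces to the
# plain learning curve value.
at_reference_rate = balut_forecast(100.0, b, 4, [10, 20, 30], 20, 0.5)
```

In this example the learning curve alone gives about 64; producing at twice the reference rate scales that by 0.75.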
ASSESSING ACCURACY 

The objective of the study is to investigate model accuracy 
under various conditions. The data for the study involved costs 
and quantities for successive production lots. Accuracy here is 
defined in terms of the ability of a model to correctly forecast 
the "next lot average unit cost." Accuracy in such near term cost 
forecasting is seen as being a relatively minimal requirement 
expected of a cost progress model. The basic process is quite 
simple: 

(a) Models were fit to a series of cost points to estimate 
(when necessary) model parameters.⁷

(b) Estimated models were used to forecast future (next 
period) average unit cost. 

(c) Realized actual unit costs were compared to forecasted 
costs to assess accuracy. 

It should be noted here that model accuracy centrally involves the 
ability to correctly forecast in advance, not the ability to 
explain a cost series ex post.⁸ Two notions of accuracy apply.
One is the absolute magnitude of forecast error, regardless of
whether the forecast is too high or too low. The second is the
direction of the error, whether the model under- or over-estimates
future cost. Given two concepts, two measures were used:



$\mathrm{ERROR} = |PUC - AUC| / AUC$   (6)

$\mathrm{BIAS} = (PUC - AUC) / AUC$   (7)



where 

PUC = predicted unit cost 
AUC = actual unit cost 

ERROR is a commonly used accuracy measure, the absolute percentage 
error. ERROR can take on only positive values and higher values, 
of course, signal poorer forecasts. BIAS takes on both positive 
and negative values. Positive (negative) values signal over 
(under) prediction of cost. 
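The two measures in equations (6) and (7) can be computed directly. The short sketch below uses hypothetical function names and values to show that a forecast 10% too high and a forecast 10% too low have the same ERROR but opposite BIAS.

```python
def forecast_error(predicted, actual):
    """Equation (6): absolute percentage error."""
    return abs(predicted - actual) / actual


def forecast_bias(predicted, actual):
    """Equation (7): signed percentage error (positive means over-prediction)."""
    return (predicted - actual) / actual


# Hypothetical forecasts against an actual unit cost of 100.
over = (forecast_error(110.0, 100.0), forecast_bias(110.0, 100.0))
under = (forecast_error(90.0, 100.0), forecast_bias(90.0, 100.0))
```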

CONDITIONS AFFECTING MODEL ACCURACY 

The general research hypothesis is that the accuracy of models 
will depend on the circumstances in which they are used. What 
circumstances might impact accuracy? Research cited above (Smunt, 
1986; Moses, 1991, 1992) suggested and discussed variables that
might have an effect. Below such variables are listed, with a
brief description and comment on how they were operationalized
(measured) empirically. Collectively these variables will be
referred to as the "condition" variables because they attempt to
represent exogenous conditions which may affect model accuracy.

1. Fixed Cost Burden: Total unit cost must consist of both
variable costs and a share of the total fixed cost burden
associated with capacity. A major role of production rate is
determining the volume of output over which fixed capacity costs 
will be spread. Hence, the importance of including a production 
rate variable in a cost model, and thus model accuracy, may depend 
on the degree to which total unit cost is made up of fixed costs. 
The following regression equation was fit to cost series data and 
the coefficient f used as a measure of fixed cost burden. 

$c_t = v + f \cdot \frac{1}{R_t}$

This equation is consistent with seeing total unit cost per period
($c_t$) as the sum of variable cost per unit (v) plus a standard fixed
cost per unit (f) adjusted for relative production rate per period
($R_t$). Higher values of f would be consistent with greater fixed
cost burden, i.e., a greater proportion of fixed cost in total
cost.
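The burden regression can be sketched as a simple least-squares fit of unit cost on the reciprocal of production rate. The sketch assumes the rate is entered as given (the text describes it as a relative rate, and any scaling is not specified here); names and data are hypothetical.

```python
def estimate_fixed_cost_burden(unit_cost, rate):
    """Fit c_t = v + f * (1 / R_t) by ordinary least squares; returns (v, f)."""
    x = [1.0 / r for r in rate]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(unit_cost) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, unit_cost))
    f = sxy / sxx            # fixed cost burden coefficient
    v = y_bar - f * x_bar    # variable cost per unit
    return v, f


# Hypothetical series built from v = 5 and f = 20.
rate = [10, 20, 40, 25]
unit_cost = [5 + 20 / r for r in rate]
v_hat, f_hat = estimate_fixed_cost_burden(unit_cost, rate)
```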

2. Learning Slope: Past simulation research (Smunt, 1986)
shows that the importance of including a learning parameter in a
cost model depends, not surprisingly, on the degree of learning
that exists in the data. Hence, accuracy across the five models
examined may depend on learning rate. Learning slopes were
measured by using the b parameter estimated from model 2,
transformed to learning rates (e.g., 90%, 80%, etc.). Higher
values indicate less learning.
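The transformation from the exponent b to a learning rate follows from equation (2): doubling cumulative quantity multiplies unit cost by 2^b. A small sketch, with hypothetical function names:

```python
import math


def slope_to_learning_rate(b):
    """Cost ratio when cumulative quantity doubles: (2Q)**b / Q**b = 2**b."""
    return 2.0 ** b


def learning_rate_to_slope(rate):
    """Inverse transformation: b = log2(rate)."""
    return math.log(rate, 2)


steep = slope_to_learning_rate(-0.322)    # roughly an 80% curve
shallow = slope_to_learning_rate(-0.152)  # roughly a 90% curve
```

The more negative exponent corresponds to the lower learning rate, which is why higher learning rates indicate less learning.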

3. Cost Variability: Costs may vary from period to period
due to unsystematic random factors. Such random factors
influencing cost can be expected to obscure systematic
relationships between cost and quantity or rate variables, reducing
the chance that a cost model will be estimated correctly and
forecast accurately (Smunt, 1986; Moses, 1991). Empirically, Cost
Variability was measured by the average period-to-period (lot-to-lot)
percentage change in average unit cost. Higher values
indicate greater period-to-period variability in unit cost.

4. Quantity Variability: If production rate was highly 
stable across periods, there would be little need for a rate 
variable in a cost model (and little ability to correctly estimate 
a rate parameter by fitting a model to past data). Hence, the 
importance of incorporating a rate variable into a cost model, and 
model accuracy, may depend on the degree to which production 
rate/quantity varies. Empirically, Quantity Variability was 
measured by the average period-to-period (lot-to-lot) percentage 
change in production quantity. Higher values indicate greater 
quantity variability. 
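Cost Variability and Quantity Variability share the same construction, sketched below. The sketch assumes that each percentage change is taken in absolute value relative to the prior lot before averaging; the text does not state whether signed or absolute changes were used. The function name and data are hypothetical.

```python
def average_percentage_change(values):
    """Mean absolute lot-to-lot percentage change (absolute value is an assumption)."""
    changes = [abs(curr - prev) / prev for prev, curr in zip(values, values[1:])]
    return sum(changes) / len(changes)


# Hypothetical unit costs: up 10% and then down 10%.
cost_variability = average_percentage_change([100.0, 110.0, 99.0])

# Hypothetical lot quantities: a steady series has zero variability.
quantity_variability = average_percentage_change([50, 50, 50, 50])
```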

5. Quantity Trend: When initiating a production/acquisition 
program for a new item, does production rate (lot quantity) start 
at a low level and build up slowly to full capacity? Or is full 
capacity production achieved rapidly? Simulation results (Moses, 
1991) have shown that the rate at which lot quantities grow when 
initiating a program affects cost model accuracy. Does a similar 
relationship exist when using real data? Empirically, the growth 
trend in lot quantity was operationalized by dividing first lot 
quantity by the average lot quantity over the (to date) life of a 
program. Hence, it is a measure of first lot size as a proportion 
of average lot size and a crude indicator of the trend in quantity.
Lower values indicate greater growth in quantity relative to
initial quantity.

6. Plot Points: The number of data points available to 
estimate the parameters of a model may affect model accuracy. Not 
surprisingly, simulation results (Moses, 1991) show that when 
comparing the relative accuracy of models, models with fewer (more) 
parameters tend to be relatively more accurate when the number of 
observations is smaller (greater). One question is whether similar 
findings will come from real data. 

7. Future Production Rate: Once a model is estimated using 
past data, it is used to forecast future cost. Changes in 
production rate between the model estimation period and the future 
should alter future unit cost and hence a model's ability to 
forecast that future cost accurately. Cost models incorporating 
production rate variables would be expected to have some advantage 
in such situations, and the degree of advantage would be expected 
to depend on how much future production rate differs from the past. 
Empirically, a variable measuring the change in production rate was 
constructed by dividing next (future) period's rate by last (most 
recent) period's rate. (This ratio was then logged to make the 
distribution symmetrical.) Positive (negative) values indicate 
increases (decreases) in production quantities. 
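The Quantity Trend and Future Production Rate measures described above can be sketched as follows. The natural logarithm is assumed for the logged ratio, since the base is not stated; function names and values are hypothetical.

```python
import math


def quantity_trend(lot_quantities):
    """First lot quantity divided by the average lot quantity to date."""
    return lot_quantities[0] / (sum(lot_quantities) / len(lot_quantities))


def future_rate_change(last_rate, next_rate):
    """Log of next-period rate over most recent rate (natural log assumed)."""
    return math.log(next_rate / last_rate)


# Hypothetical program that builds up from 10 to 30 units per lot.
trend = quantity_trend([10, 20, 30])

# Doubling and halving the rate give values of equal size and opposite sign.
increase = future_rate_change(50, 100)
decrease = future_rate_change(100, 50)
```

The symmetry of `increase` and `decrease` around zero illustrates why the ratio was logged.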

SAMPLE AND DATA 

The accuracy of the cost progress models was investigated 
using data for a sample of military aircraft and missile systems 
programs taken from the U.S. Military Aircraft Cost Handbook
(DePuy et al., 1983) and the U.S. Missile Cost Handbook
(Crawford et al., 1984). These handbooks contain data for
virtually all military aircraft and missile programs from the early 
1960s through the early 1980s. Two basic data items were collected 
from the handbooks for each program: annual lot quantities and 
average airframe unit costs per lot (in 1981 constant dollars). 
Programs were deleted from consideration if there were incomplete 
data or if the programs ran fewer than five years (a minimum number
of data points was needed to fit the cost progress models). Based 
on these criteria, 46 programs (32 aircraft, 14 missile) were 
included in the final sample. These programs ranged in length from 
five years to thirteen years. 

The original sample of 46 programs was "expanded" into 121 
separate cost series. This was accomplished by dividing each 
program cost series into separate individual year-to-date cost 
series. For example, if a particular program had cost data 
available for six years, say 1970-1975, this single program cost 
series would be expanded into three separate series as follows: 

Cost series #1: 1970-1973 data (used to forecast 1974 cost)

Cost series #2: 1970-1974 data (used to forecast 1975 cost)

Cost series #3: 1970-1975 data (used to forecast 1976 cost)

Thus the initial cost series for each program includes the first
four years of data, while subsequent cost series were created by
additionally including data from the next year in the cost series.
This approach makes maximum use of data and approximates the actual
process of a cost estimator who would update a forecast model each
period to incorporate the most recent data.
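The expansion of a program into year-to-date cost series can be sketched as an expanding window. The function below simply mirrors the six-year example in the text, starting with the first four years and adding one year at a time; its name and the `min_points` parameter are hypothetical.

```python
def expand_cost_series(years, min_points=4):
    """Return year-to-date series of increasing length, starting at min_points."""
    return [years[:k] for k in range(min_points, len(years) + 1)]


# Hypothetical program with data for 1970 through 1975.
series = expand_cost_series(list(range(1970, 1976)))
last_years = [s[-1] for s in series]
```

For the six-year example this yields three series ending in 1973, 1974 and 1975, matching the listing above.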



ANALYSIS AND FINDINGS 

The basic methodology used to assess cost model accuracy was 
as follows: Each of the five alternative models was estimated 
(when necessary) on each of the 121 cost series. Next-period data 
(e.g., cumulative quantity and/or production rate) was input to
each model to forecast next-period cost. Then next-period 
forecasted cost and next-period actual cost were compared. Thus 
the process produced 121 measures of error for each of the five 
models. The analysis primarily involves describing and explaining 
(when possible) the pattern of errors observed across the different 
models and across the different circumstances (i.e., across 
different values of the seven condition variables). 
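The estimate, forecast and compare loop can be sketched for the two models that need no fitted learning parameter. The sketch assumes a fixed overhead share of 0.3 for the rate adjustment model (a purely illustrative value) and scores only those windows whose next-period actual cost is present in the hypothetical data. All names and numbers are assumptions.

```python
def random_walk_forecast(costs, rates, next_rate):
    """Next cost equals the most recent cost; the rate inputs are ignored."""
    return costs[-1]


def rate_adjusted_forecast(costs, rates, next_rate, fixed_share=0.3):
    """Most recent cost scaled by (R_prev / R_next) * F + (1 - F); F is assumed."""
    factor = (rates[-1] / next_rate) * fixed_share + (1.0 - fixed_share)
    return costs[-1] * factor


def next_period_errors(costs, rates, model, min_points=4):
    """Absolute percentage error of each one-step-ahead forecast."""
    errors = []
    for k in range(min_points, len(costs)):
        predicted = model(costs[:k], rates[:k], rates[k])
        actual = costs[k]
        errors.append(abs(predicted - actual) / actual)
    return errors


# Hypothetical six-lot program.
costs = [10.0, 9.0, 8.5, 8.0, 7.0, 7.5]
rates = [20, 30, 40, 50, 80, 60]

rw_errors = next_period_errors(costs, rates, random_walk_forecast)
ra_errors = next_period_errors(costs, rates, rate_adjusted_forecast)
```

In this made-up series the rate jump from 50 to 80 is partly anticipated by the rate adjustment forecast, so its first error is smaller than the random walk's; nothing about the study's actual results should be inferred from that.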

General Error Patterns - Descriptive Statistics: 

Table 1 provides selected descriptive statistics for both 
ERROR and BIAS for the five models. A general pattern is evident: 
Moving from the left to the right in the table, both magnitude of 
ERROR (mean and median) and the dispersion in ERROR (standard 
deviation and SIQR) tend to increase. Average magnitude of error 
ranges from about 13% to 25%. Note that this movement from left to 
right in the table coincides with increased complexity of the 
models: The random walk (RW) model considers only past cost in the 
forecast; the learning curve (LC) and rate adjustment (RA) models 
additionally consider either learning or production rate, but not 
both; while the Bemis (BE) and Balut (BA) models consider both
learning and production rate. One might have hypothesized in
advance that accuracy would improve, not deteriorate, with the
incorporation of additional variables; that of course is the point
of using more complex models for forecasting.

Table 1

Error Statistics for Alternative Cost Progress Models

                                         MODELS
Statistic                       RW       LC       RA       BE       BA

Mean - absolute error          .125     .169     .160     .208     .245
Median - absolute error        .074     .124     .099     .138     .143
Stnd. Dev. - absolute error    .129     .153     .173     .211     .296
SIQR¹ - absolute error         .126     .169     .146     .273     .230
Mean - bias                    .049    -.033     .113     .023     .129
Median - bias                  .016    -.061     .059    -.013     .047

¹ SIQR = Semi-interquartile range: (75th quantile - 25th quantile)

At least three possibilities perhaps explain the contrary 
finding. First, the more complex models could simply be mis- 
specified in that the relations implied between cost, quantity and 
rate do not adequately describe reality. Forecasts from 
theoretically incorrect models would be expected to perform poorly. 
Second, the models could be correctly specified, but the amount of 
"noise" in the cost data relative to the proportion of variance in 
cost explainable by the learning or rate variables may be too high. 
Hence, parameter estimates are unreliable and forecasts poor. 
Third, the more complex models could be correctly specified but, 
because they incorporate more variables, the data in general are 
too lean (too few observations in the cost series) to estimate the 
model parameters. This is a problem of degrees of freedom. If 
this is the case, then the more complex models should perform 
better as the data become richer. This particular possibility will 
be addressed later. 

It should also be noted that more complex models incorporating 
more variables typically have greater ability to explain, ex post, 
a cost series (i.e., $r^2$ goes up as the number of explanatory
variables does). Thus, the results here suggest that ex post
explanation and ex ante forecasting need not be strongly related.
This is consistent with previous findings for cost models from
simulation studies (Moses, 1993).

Another general result from Table 1 concerns bias. Values for
BIAS tend to be positive, except for the LC model. Thus, the 
models tend to over-estimate future cost, providing forecasts that 
on average are too high. This tendency is strongest for the RA and 
BA models. In contrast, the traditional learning curve (LC) tends 
to under-estimate future cost. This finding for the learning curve 
is also consistent with previous conclusions from simulation 
studies (Moses, 1992). 

Relationship Between Accuracy and Conditions: 

Is the accuracy of the models dependent on the circumstances 
in which they are used? Do models perform well in some 
circumstances, less well in others? To get a first-cut answer to 
these questions, three tests of the relationship between ERROR
(from each of the five models separately) and the condition 
variables were conducted: 

1. Pairwise Correlations: This is a univariate test of 
association, where measurement errors in other variables do not 
intrude. 

2. Multiple Regression of ERROR on the Condition Variables 
together: This is a test of association for each variable while 
controlling for the others. 

3. Stepwise Regression of ERROR on the Condition Variables:
This permits variables that maximally explain ERROR to be
identified. (The stepwise procedure was stopped when no additional
variable would significantly (alpha < .05) enter the regression
model.)

Table 2

Test of Relationship Between Cost Progress
Model Errors and Explanatory Conditions

Conditions             Test Statistics     RW        LC        RA        BE        BA

Burden                 Corr.               .13       .19*      .22*      .21*      .12
                       Reg. Coef.          .05       .01       .14      -.02       .14
                       Reg. t             1.11       .20      2.40*     -.30      1.40
                       Step. Coef.                             .16
                       Step. t                                3.60***

Learning Slope         Corr.               .12       .01       .13      -.11       .20*
                       Reg. Coef.          .32       .48       .44       .05      1.21
                       Reg. t             2.20*     2.89**    2.36*      .20      3.79***
                       Step. Coef.                   .29       .39                 .89
                       Step. t                      2.29*     2.29**              3.68***

Cost Variability       Corr.               .09       .35***    .06       .26**     .07
                       Reg. Coef.          .06       .35       .04       .38       .27
                       Reg. t              .76      3.58***    .42      2.87**    1.43
                       Step. Coef.                   .35                 .32       .42
                       Step. t                      4.68***             3.48***   2.87**

Quantity Variability   Corr.              -.10       .03      -.15      -.15      -.09
                       Reg. Coef.         -.07      -.03      -.01      -.14      -.01
                       Reg. t            -1.31      -.54      -.23     -1.77      -.13
                       Step. Coef.
                       Step. t

Quantity Trend         Corr.              -.05       .09      -.09       .06      -.04
                       Reg. Coef.          .04       .06       .01      -.01       .09
                       Reg. t             1.31      1.88       .31      -.16      1.50
                       Step. Coef.
                       Step. t

Plot Points            Corr.               .06       .01      -.08      -.15      ...
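The first of the three tests, the pairwise correlation, can be sketched as below. The sketch assumes a Pearson correlation and the usual t statistic for testing a zero correlation with n - 2 degrees of freedom; the paper does not state which correlation measure was used. Function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t(r, n):
    """t statistic for the hypothesis of zero correlation, n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical condition values and forecast errors.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# A correlation of .19 across 121 cost series.
t_value = correlation_t(0.19, 121)
```

With 121 observations, a correlation of .19 corresponds to a t value a little above 2.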

Correlations, regression coefficients and t values from these 
three approaches are provided in Table 2. Several observations 
concerning Table 2 follow. 

First, where results are strong (significant at a higher level 
of probability) in one of the three tests, they tend to be 
corroborated in the other two tests. So there is at least some 
convergence across the tests. 

Second, for three of the seven conditions (Quantity 
Variability, Quantity Trend and Plot Points) there are no 
significant results and thus no indication that model accuracy 
depends on these factors. This is of interest simply because all 
of the factors in this study have been shown to impact accuracy in 
at least one of the simulation studies cited previously. Of 
particular interest is the non-result for Plot Points. For none of 
the five models does the magnitude of forecast error depend on the 
number of observations in the cost series used to estimate the 
model. This suggests that the degrees of freedom problem in model 
estimation mentioned earlier is not the likely explanation for some 
models performing better or worse than others. 

Third, significant results are found for the other four 
condition variables, and these results are not limited to single 
models. Rather, the accuracy of several of the models (at least
three of the five) is related to these conditions.

How these conditions affect the accuracy of individual models 
differs from model to model, however. What follows is a model-by-model
look at the impact of the conditions. The approach used was
to partition the sample into three subsamples depending on whether 
the values for a condition variable were low (bottom quartile), 
medium (middle 50%), or high (top quartile) and then, for each 
model, observe and plot average values for ERROR for these three 
subsamples. This approach is followed below for variables found 
significant in the Table 2 tests. 
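The partitioning into low, medium and high subsamples can be sketched with quartile cut points. The sketch uses Python's `statistics.quantiles` with its default interpolation method, which is an assumption about how the quartiles were located; the function name and data are hypothetical.

```python
import statistics


def mean_error_by_level(condition, errors):
    """Average ERROR for the bottom quartile, middle 50% and top quartile."""
    q1, _, q3 = statistics.quantiles(condition, n=4)
    groups = {"low": [], "medium": [], "high": []}
    for value, err in zip(condition, errors):
        if value <= q1:
            groups["low"].append(err)
        elif value >= q3:
            groups["high"].append(err)
        else:
            groups["medium"].append(err)
    return {level: sum(v) / len(v) for level, v in groups.items() if v}


# Hypothetical condition values and forecast errors for eight cost series.
condition = [1, 2, 3, 4, 5, 6, 7, 8]
errors = [0.30, 0.20, 0.10, 0.10, 0.10, 0.10, 0.40, 0.50]
by_level = mean_error_by_level(condition, errors)
```

Plotting the three averages against the overall mean error gives the kind of non-monotonic pattern discussed for some of the models below.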

Error Analysis for Each Model 

1. Random Walk Model: The Table 2 tests showed that RW model
accuracy depended on two conditions, Learning Curve Slope and
Future Production Rate, so the sample was partitioned
(separately) on each of these two variables and average ERROR from 
the RW model determined for each of the three subsamples. Plots 
showing RW ERROR as a function of these two condition variables are 
in Figure 1. A horizontal line in the plot marks the overall 
average RW ERROR, so movement above and below this line indicates 
the impact of differing conditions. 

First, RW ERROR depends somewhat on the Learning Curve Slope 
exhibited in the cost data, with greater ERROR experienced when 
learning slopes are high — i.e., when little learning apparently 
has occurred. The fact that RW ERROR depends on the degree of 
learning is not surprising; the RW model ignores learning and hence 
the degree to which it mis-forecasts cost ought to depend on the 
degree of learning occurring in the cost series. But the observed 
pattern is the opposite of the expected one. One would expect the 
RW ERROR to be greater when more learning was taking place, not
less. The degree to which ERROR depends on learning slope is
admittedly small, but the reason for the particular pattern is not
obvious.

[Figure 1. Random Walk Model: Plot of Forecast Error by Levels of Condition Variables]

Second, RW ERROR depends on the Future Production Rate. Note 
the pattern is not monotonic; ERROR is higher than average for low 
values of future production, dips below average for mid-range 
values, and increases substantially for high values. This pattern 
is quite interesting but, as will be seen, it is repeated for all 
of the models and will be discussed later. 

2. Learning Curve Model: Figure 2 shows how the accuracy of
the traditional learning curve depends on Burden, Learning Curve
Slope and Cost Variability. The role of Burden seems straightforward:
The LC model does not include a production rate variable,
and one of the roles of a rate variable is to deal with the effect 
of spreading fixed overhead burden over varying levels of output. 
The LC model should be expected to perform more poorly when the 
level of burden is high. 

That the accuracy of the LC model should also depend on the 
degree of learning estimated by the model is somewhat interesting. 
The effect shown in Figure 2 is mild but shows that LC ERROR is 
slightly higher when estimated learning rates are in either the 
bottom or top quartiles. A fuller story comes from observing BIAS 
rather than ERROR. When much learning appears to be occurring, the 
LC model under-estimates future cost (average BIAS of -15%). When 
little learning appears to be occurring, the LC model 
over-estimates future cost (average BIAS of +12%). What seems to be 
happening is a "regression to the mean" effect. A high (low) rate 
of past cost reduction causes the model to forecast a high (low) 
rate of future cost reduction and, in each case, the high (low) 
rate regresses to a more average rate, causing consistent under- or 
over-estimation of future cost. 
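
The regression-to-the-mean explanation can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the study: every simulated program shares the same true learning slope (so all differences in apparent learning are noise), costs follow a unit learning curve with multiplicative lognormal noise, BIAS is taken as (forecast - actual) / actual, and all names and parameter values are hypothetical.

```python
import math
import random

def fit_learning_curve(quantities, costs):
    """Least-squares fit of log(cost) = log(a) + b * log(quantity); returns (a, b)."""
    xs = [math.log(q) for q in quantities]
    ys = [math.log(c) for c in costs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return math.exp(y_bar - b * x_bar), b

def bias_by_apparent_learning(n_programs=2000, true_slope=0.85, noise=0.10, seed=1):
    """Mean BIAS for the programs with the most and the least apparent learning."""
    rng = random.Random(seed)
    b_true = math.log(true_slope, 2)          # learning exponent implied by the slope
    results = []                              # (estimated exponent, bias) per program
    for _ in range(n_programs):
        # Seven periods of noisy cost around the same true learning curve.
        costs = [100.0 * q ** b_true * math.exp(rng.gauss(0.0, noise))
                 for q in range(1, 8)]
        a_hat, b_hat = fit_learning_curve(range(1, 7), costs[:6])  # fit on periods 1-6
        forecast = a_hat * 7 ** b_hat                              # forecast period 7
        results.append((b_hat, (forecast - costs[6]) / costs[6]))
    results.sort()                            # most negative exponent = most apparent learning
    k = n_programs // 4
    steep = sum(bias for _, bias in results[:k]) / k
    flat = sum(bias for _, bias in results[-k:]) / k
    return steep, flat

steep_bias, flat_bias = bias_by_apparent_learning()
```

Under these assumptions the quartile with the most apparent learning shows negative average BIAS (under-estimation) and the quartile with the least apparent learning shows positive average BIAS, the same direction as the pattern reported above.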

A more pronounced effect occurs for Cost Variability, with a 
sharp increase in LC ERROR when past costs have varied greatly from 
period to period. This finding is consistent with past simulation 
results suggesting that LC models try to explain all variability in 
cost through the estimation of the single learning parameter and, 
when there is considerable period to period "noise" in the cost 
series, end up erroneously "interpreting" that noise in the 
estimated learning rate. 

3. Rate Adjustment Model: Figure 3 shows how the accuracy of 
the rate adjustment model depends on Burden, Learning Curve Slope 
and Future Production Rate. The figure shows that RA model ERROR 
increases as the fixed overhead burden increases. Although 
statistically significant, the effect is mild. It is also not 
obvious why this should occur. The approach of the RA model is to 
adjust unit cost for the effect of spreading fixed cost burden over 
varying output volume. The evidence here indicates that the 
ability of this model to properly adjust depends on how much fixed 
overhead there is. 

The finding that RA ERROR (mildly) depends on Learning Curve 
Slope, or at least the direction of the finding, is unexpected. 
Since the RA model ignores learning, one would expect ERROR to be 
greatest when learning was greatest (lowest slope values). The 
opposite effect is exhibited. 

[Figure 2. Learning Curve Model: Plot of Forecast Error by Levels of Condition Variables] 

The biggest impact on RA ERROR is due to differences in Future 
Production Rate. As noted when discussing the RW model, a "V" 
shaped pattern occurs, with ERROR growing as Future Production Rate 
diverges from the middle range. Again, this will be discussed 
later. 

4. Bemis Model: Figure 4 shows how the accuracy of the Bemis 
model depends on Burden, Cost Variability and Future Production 
Rate. BE model ERROR increases with increases in Burden. This 
positive relationship is the same as just noted for the RA model, 
as is the interpretation. In both cases, the model includes a rate 
term which is designed in part to capture the effect of spreading 
fixed cost burden over differing output volume. In both cases, the 
model's accuracy declines as the amount of Burden increases. 

As Figure 4 shows, BE ERROR also is larger when there is 
relatively greater period-to-period variation in cost. The BE 
model is the same as the LC model, with a rate term tacked on, and 
this finding is shared with the LC model (and a similar explanation 
may apply). 

Lastly, BE ERROR also depends on Future Production Rate, with 
the same "V" shaped pattern to be discussed later. 

5. Balut Model: Figure 5 shows that the accuracy of the 
Balut model depends on Learning Curve Slope, Cost Variability and 
Future Production Rate. BA ERROR tends to be considerably smaller 
when learning is great (lower Learning Curve Slope values). Two 
offsetting effects may explain this. First, models with learning 
variables tend to be biased toward under-forecasting of future cost 
when the apparent learning is great (because of the 
regression-to-the-mean effect, previously discussed). Second, overall, the BA 
model tends to be biased toward over-forecasting of future cost (as 
seen in Table 1). These two effects offset, resulting in more 
accurate forecasts for the subsample where learning is great. 
(BIAS turned out to be essentially zero for this subsample and 
about +5% to +8% for the other two.) 

BA accuracy is also dependent on Cost Variability, although 
the finding is only mildly significant. Figure 5 shows greater 
ERROR when Cost Variability is in the middle range; there is no 
obvious explanation for this non-monotonic inverted "V" pattern. 

Lastly, BA ERROR is also dependent on the Future Production 
Rate, with the now familiar "V" pattern. This general result will 
be discussed next. 

The Impact of Future Production Rate 

Of the seven condition variables, Future Production Rate is 
special for four reasons. First, conceptually it is distinct. The 
other six variables describe conditions existing during the periods 
over which the models are estimated — i.e., the past. In 
contrast, Future Production Rate describes a condition (the level 
of production) expected to exist during the period for which cost 
is being forecast. Second, how models perform in situations where 
production rates are changing is of particular importance for 
today's cost analyst, facing cost forecasting problems in an 
environment of rapid industrial change, such as production rate 
cutbacks in the defense industry. Third, the previous results have 
shown that in general the largest swings in average ERROR occur 
when moving across the subsamples partitioned on Future Production 
Rate. Last, the pattern of errors is consistent and non-monotonic, 
a V-shaped pattern with top and bottom quartile values for Future 
Production Rate associated with larger ERROR. Figure 6 summarizes 
this finding for all five models. 

What does the V-shaped pattern mean? Simply put, if 
production rate in the period for which cost is being forecast 
diverges much from the recent past, either up or down, the accuracy 
of all five of the models deteriorates. This is not a surprising 
finding for models 1 and 2, the random walk (RW) and traditional 
learning curve (LC), because neither model incorporates production 
rate as a variable. But the fact that the RA, BE and BA models 
exhibit the same pattern indicates that the attempts of these 
models to explicitly capture production rate effects have not been 
fully successful. 

Given that all the models mis-forecast cost when future 
production rate changes, a related question is: In what direction? 
This can be answered by observing values for BIAS, which are 
plotted in Figure 7. Some patterns from Figure 7 are of interest: 
First, the RW and LC models both under-estimate cost (negative 
BIAS) when future production rate falls and over-estimate cost when 
future production rate rises. This is not surprising. Falling 
rate should increase actual unit cost, because fixed capacity costs 
are spread over less output. The RW and LC models "miss" this 
effect and thus consistently under-estimate unit cost. The 
opposite effect occurs when production rate increases, leading to 
over-estimates of unit cost. 
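
A short worked example shows the direction of this effect. It is a sketch with hypothetical figures, assuming that unit cost is a constant variable cost plus a fixed capacity cost spread over the period's output, that learning is ignored, that the random walk forecast is simply the last observed unit cost, and that BIAS is (forecast - actual) / actual.

```python
def unit_cost(variable_cost, fixed_cost, rate):
    """Unit cost when a fixed capacity cost is spread over `rate` units of output."""
    return variable_cost + fixed_cost / rate

# Hypothetical program: variable cost of 80 per unit, fixed cost of 2000 per period.
last_period = unit_cost(80.0, 2000.0, rate=100)   # cost observed at the past rate
rw_forecast = last_period                         # assumed RW forecast: no change in cost

bias_by_rate = {}
for future_rate in (50, 100, 200):                # rate falls, stays flat, rises
    actual = unit_cost(80.0, 2000.0, future_rate)
    bias_by_rate[future_rate] = (rw_forecast - actual) / actual
```

With these numbers the forecast is about 17% too low when the rate halves, exact when the rate is unchanged, and about 11% too high when the rate doubles, which matches the sign pattern described for the RW and LC models.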

BIAS for the BE model goes from slightly positive to slightly 
negative as Future Production Rate increases, but the effect is 
mild and insignificant. This is consistent with other 
investigations of this model which showed that, although the 
magnitude of error may vary across conditions, the BE model is 
consistently unbiased (Moses, 1992). 

The impact of Future Production Rate on BIAS from the RA and 
BA models is more dramatic and significant — and difficult to 
explain. Changing Future Production Rate in either direction, up 
or down, causes these models to over-estimate cost (positive BIAS). 
Both the RA and BA models "handle" rate changes in the same way, 
using the rate adjustment factor developed by Balut. But why this 
factor might lead to consistent over-estimation of cost, regardless 
of whether future production rate rises or falls, is not obvious. 

Comparisons of Model Accuracy 

Given that the accuracy of the five models depends on the 
conditions under which they are used, an inevitable question 
arises: Which model appears to perform "best" under which 
conditions? Table 3 ranks the models by median ERROR, both overall 
(full sample) and by subsamples partitioned on values of the seven 
condition variables. Several observations seem noteworthy from 
these comparisons. 
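
The ranking step itself is simple and can be sketched as below. The model labels and error values are hypothetical and are not the study's data; ties are not treated specially in this sketch.

```python
from statistics import median

def rank_models(errors_by_model):
    """Rank models by median error; rank 1 is the most accurate."""
    medians = {model: median(errs) for model, errs in errors_by_model.items()}
    ordered = sorted(medians, key=medians.get)    # smallest median error first
    return {model: rank for rank, model in enumerate(ordered, start=1)}

# Hypothetical forecast errors for three illustrative models in one subsample.
sample = {
    "A": [0.09, 0.15, 0.20],
    "B": [0.05, 0.08, 0.12],
    "C": [0.06, 0.10, 0.30],
}
ranks = rank_models(sample)
```

Applying the same function to the full sample and to each subsample would give one column of ranks per condition level.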



[Figure 6. Forecast ERROR for Models 1 through 5 by Level of Future Production Rate (Low, Middle, High)] 

[Figure 7. Forecast BIAS for Models 1 through 5 by Level of Future Production Rate (Low, Middle, High)] 

Table 3 

Ranking of Alternative Cost Progress 
Models in Terms of Median Error 
(Most accurate = 1, least accurate = 5) 

First is the consistent domination of the RW model, ranking 
most accurate overall and in all but a couple of the subsamples. 

Next is the "second place" showing for the RA model. It is 
second most accurate overall (and in a majority of the subsamples) 
and tends to be the model that outperforms the RW when the RW is 
not most accurate. This showing for the RA model is a bit 
surprising. The model is an abbreviated (no learning) version of 
the Balut (1981) model, and was created for this study simply to 
include, and test, a model incorporating rate changes but not 
learning. This model easily outperformed the "full" Balut model 
(#5) suggesting that Balut's contribution to modeling, the rate 
adjustment factor, may be even more useful when left "unattached" 
to the learning curve. 

Third is the tendency for the models that required estimation 
of a learning rate (the LC, BE and BA models) to perform less well. 

Last, there is a general pattern: an inverse relationship 
between accuracy and the number of variables in a model. The LC 
and RA models incorporate one variable more than the RW model 
(either cumulative quantity or production rate) and accuracy 
declines. The BE and BA models incorporate two additional 
variables (both cumulative quantity and production rate) and 
accuracy declines some more. 

CONCLUSIONS AND FINAL COMMENTS 

The objective of this paper has been to document the accuracy 
of five familiar cost progress models under varying conditions, 
using cost data from real world programs. Accuracy was evaluated 
in terms of ability to forecast next-period unit cost. Data 
consisted of annual lot costs from 46 military aerospace programs, 
arranged so that models were used to forecast 121 next-period 
costs. The five cost progress models forecasted future cost using 
some combination of variables reflecting (a) past costs, (b) 
cumulative quantity, and (c) production rate. Specific findings 
and error patterns have been presented; broader conclusions follow: 

1. The accuracy of all cost progress models (tested) does 
depend on the circumstances or conditions in which they are used. 
Those conditions can be identified in advance. Thus a cost 
estimator using a particular model may be able to assess the risk 
of forecast error depending on the conditions. 

2. Which conditions affect accuracy, and by how much, varies 
somewhat from model to model. But the results suggest that the 
amount of fixed cost burden, the degree of apparent learning, the 
degree of past variability in period-to-period cost and, 
particularly, the nature and degree of change in the future 
production rate provide information that can inform a cost 
estimator about the risk of forecast error from using a particular 
model. 

3. It is not obvious that more sophisticated cost progress 
models improve forecasting. Quite the contrary for the sample 
here; forecast accuracy declined as additional variables were 
included. 

4. Attempts by the models in this study to deal with the 
effects of changing production rate (of particular interest 
currently, given the changing industrial picture) do not appear to 
have been very successful. This conclusion follows from the 
relatively poorer accuracy of the BE and BA models and from the 
fact that error for all the models increased when future production 
rates varied from the past. The model that did the best at 
(explicitly) adjusting forecasts for rate changes seems to be the 
simpler RA model and further study of the usefulness of this model 
seems warranted. 

5. Although a relatively large sample of aerospace programs 
was included, all of the findings and conclusions should be 
tempered by the acknowledgement that they came from tests on one 
set of data — cost data that was at a high level of aggregation 
(annual lot costs) and reasonably lean (the maximum data points for 
fitting a model was 13). Results would likely be most 
generalizable to similar cost forecasting situations. On the other 
hand, many of the error patterns observed in this study have also 
been observed in previous studies evaluating models on simulated 
data, so it is unlikely that the error patterns observed can be 
discounted as simply sample specific. Perhaps some of the findings 
may be viewed as tentative — as hypotheses to be additionally 
supported (or contradicted) by future research. Given the findings 
of this study, one direction such research might take would be to 
start with the following question: Under what circumstances can 
more complex cost progress models outperform the simple random walk 
model? 

ENDNOTES 

1. Yelle (1979) reviews the literature, with an emphasis on 
applications of the learning curve approach. Dutton and Thomas 
(1984) provide a more recent review, identifying and categorizing 
the factors that cause the learning phenomenon. Teplitz (1991) 
provides a comprehensive practical introduction to using learning 
curves, including a discussion of modeling problems and curve 
forms. 

2. One review of the literature pertaining to learning curves 
(Cheney, 1977) found that 36% of the articles reviewed attempted to 
augment the learning curve model in some manner by the inclusion of 
production related variables. 

3. Several explanations for these varying, inconclusive empirical 
results can be offered: (a) Varying results are to be expected 
because rate changes can lead to both economies and diseconomies of 
scale. (b) Production rate effects are difficult to isolate 
empirically because of collinearity with cumulative quantity 
(Gulledge and Womer, 1986). (c) Researchers have usually used 
inappropriate measures of production rate leading to misspecified 
models (Boger and Liao, 1990). (d) The impact of a production 
rate change is dominated by other uncertainties (Large, Hoffmayer, 
and Kontrovich, 1974), particularly by cumulative quantity (Asher, 
1956). Alchian (1963), for example, was unable to find results for 
rate adjustment models that improved on the traditional learning 
curve without a rate parameter. 

4. Note that this is an incremental unit cost model rather than a 
cumulative average cost model. Liao (1988) discusses the 
differences between the two approaches and discusses why the 
incremental model has become dominant in practice. One reason is 
that the cumulative model weights early observations more heavily 
and, in effect, "smooths" away period-to-period changes in average 
cost. 

5. Empirically a value for F of 14.7% was used. This figure comes 
from Balut (1981) and is an average derived from aerospace industry 
data during the late 1970s. 

6. Readers familiar with the Balut modeling approach will recall 
that cost estimates were made using a learning curve and then 
adjusted using the overhead redistribution model. Learning is 
ignored here by design. The intent is to present a model which 
reflects changes in production rate only (i.e., a model from 
category 3 listed previously). Model 5 will reincorporate 
learning. 

7. There was no need to estimate parameters for models 1 and 3. 
Variables can just be plugged in to create a cost forecast. Models 
2 and 4 were estimated using standard linear regression on logged 
variables. Estimating model 5 on real data and making a cost 
forecast using the model involved several steps: (1) An average 
production rate (R_a) for all past lots was calculated as a 
reference. (2) Adjustment factors (A_at) for each lot were 
calculated as a function of differences between lot production rate 
(R_t) and the average rate (R_a). (3) Actual past unit costs were 
transformed using the adjustment factors to the unit costs they 
"would have been" if the production rate had not differed from the 
average. (4) Traditional learning curves were fit to these 
transformed costs to estimate learning curve parameters. (5) The 
learning curve was used to forecast future cost, assuming future 
production rate would be average. (6) Future unit cost was 
adjusted if the production rate in the future period differed from 
the average. 
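
The six steps for model 5, together with the log-linear regression used for models 2 and 4, can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the adjustment factor is treated as a multiplicative factor supplied by the caller, and `placeholder_adjustment` is a simple fixed-cost-spreading stand-in, not the Balut (1981) factor, whose form is not reproduced here.

```python
import math

def fit_learning_curve(quantities, costs):
    """Linear regression on logged variables: log(cost) = log(a) + b * log(quantity)."""
    xs = [math.log(q) for q in quantities]
    ys = [math.log(c) for c in costs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return math.exp(y_bar - b * x_bar), b

def placeholder_adjustment(rate, avg_rate, fixed_share=0.2):
    """Hypothetical factor: cost at `rate` relative to cost at `avg_rate`,
    assuming a share `fixed_share` of unit cost is fixed and spread over output."""
    return (1.0 - fixed_share) + fixed_share * avg_rate / rate

def forecast_with_rate_adjustment(quantities, costs, rates,
                                  next_quantity, next_rate, adjustment):
    avg_rate = sum(rates) / len(rates)                        # step 1: reference rate
    factors = [adjustment(r, avg_rate) for r in rates]        # step 2: factor per lot
    normalized = [c / f for c, f in zip(costs, factors)]      # step 3: costs at average rate
    a, b = fit_learning_curve(quantities, normalized)         # step 4: fit learning curve
    at_avg_rate = a * next_quantity ** b                      # step 5: forecast at average rate
    return at_avg_rate * adjustment(next_rate, avg_rate)      # step 6: adjust for future rate

# Hypothetical lots generated from a 90% learning curve with rate effects.
b_true = math.log(0.9, 2)
quantities = [10, 30, 60, 100]
rates = [10, 20, 30, 40]
costs = [100.0 * q ** b_true * placeholder_adjustment(r, 25.0)
         for q, r in zip(quantities, rates)]

forecast = forecast_with_rate_adjustment(quantities, costs, rates,
                                         next_quantity=150, next_rate=50,
                                         adjustment=placeholder_adjustment)
```

Because the sample costs are generated without noise from the same assumed structure, the sketch recovers the generating curve exactly; with real data, the quality of the result would depend on how well the chosen adjustment factor describes actual rate effects.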

8. Some research (e.g., Moses, 1993) has shown there may be little 
association between a cost model's ability to explain past costs 
and its ability to forecast future costs. Higher R² (better 
explanation) can always be achieved by adding variables to a model 
but high R² may be a poor indicator of forecast accuracy. 